33 research outputs found

    Temporal word embeddings for dynamic user profiling in Twitter

    The research described in this paper focused on exploring the domain of user profiling, a nascent and contentious technology which has been steadily attracting increased interest from the research community as its potential for providing personalised digital services is realised. An extensive review of related literature revealed that limited research has been conducted into how temporal aspects of users can be captured using user profiling techniques. This, coupled with the notable lack of research into the use of word embedding techniques to capture temporal variances in language, revealed an opportunity to extend the Random Indexing word embedding technique such that the interests of users could be modelled based on their use of language. To achieve this, this work concerned itself with extending an existing implementation of Temporal Random Indexing to model Twitter users across multiple granularities of time based on their use of language. The product of this is a novel technique for temporal user profiling, where a set of vectors is used to describe the evolution of a Twitter user's interests over time through their use of language. The vectors produced were evaluated against a temporal implementation of another state-of-the-art word embedding technique, the Word2Vec Dynamic Independent Skip-gram model, where it was found that Temporal Random Indexing outperformed Word2Vec in the generation of temporal user profiles.
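The idea of describing a user with one vector per time period can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that each token gets a sparse random index vector that is the same in every time slice (so that slices stay comparable), and it simplifies a profile to the sum of the index vectors of the tokens a user wrote in that slice. The names `DIM`, `NONZERO`, `index_vector`, `temporal_profiles` and `cosine` are hypothetical.

```python
import hashlib
import math
import random
from collections import defaultdict

DIM = 64      # vector dimensionality (illustrative value, not from the paper)
NONZERO = 4   # number of +1/-1 entries per index vector (illustrative value)


def index_vector(token):
    """Sparse ternary random vector, seeded by the token itself so that the
    same token maps to the same vector in every time slice."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    vec = [0.0] * DIM
    for pos in rng.sample(range(DIM), NONZERO):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def temporal_profiles(tweets):
    """tweets: iterable of (time_slice, text) pairs for one user.
    Returns {time_slice: profile vector}. A profile is simplified here to
    the sum of the index vectors of the tokens used in that slice."""
    profiles = defaultdict(lambda: [0.0] * DIM)
    for time_slice, text in tweets:
        profile = profiles[time_slice]
        for token in text.lower().split():
            for i, value in enumerate(index_vector(token)):
                profile[i] += value
    return dict(profiles)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Comparing the profiles of consecutive slices with `cosine` then gives a rough signal of how much a user's language, and by assumption their interests, shifted between periods.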

    Personalised multilingual hypertext retrieval: An overview

    The aims of the workshop on Personalised Multilingual Hypertext Retrieval (PMHR) are twofold: to set the scene in this challenging area, allowing the different communities engaged in related research topics to meet and to determine a program of actions to undertake; and to devise a strategy for the evaluation of PMHR systems, which should define the collection of resources to use to evaluate such systems together with the evaluation metrics to use. The workshop results will be of use in the design of personalised tools that can help end-users fully benefit from the use of distributed multilingual hypertext content.

    A proposal for the evaluation of adaptive information retrieval systems using simulated interaction

    The Centre for Next Generation Localisation (CNGL) is involved in building interactive adaptive systems which combine Information Retrieval (IR), Adaptive Hypermedia (AH) and adaptive web techniques and technologies. The complex functionality of these systems coupled with the variety of potential users means that the experiments necessary to evaluate such systems are difficult to plan, implement and execute. This evaluation requires both component-level scientific evaluation and user-based evaluation. Automated replication of experiments and simulation of user interaction would be hugely beneficial in the evaluation of adaptive information retrieval systems (AIRS). This paper proposes a methodology for the evaluation of AIRS which leverages simulated interaction. The hybrid approach detailed combines: (i) user-centred methods for simulating interaction and personalisation; (ii) evaluation metrics that combine Human Computer Interaction (HCI), AH and IR techniques; and (iii) the use of qualitative and quantitative evaluations. The benefits and limitations of evaluations based on user simulations are also discussed.

    Applying digital content management to support localisation

    The retrieval and presentation of digital content such as that on the World Wide Web (WWW) is a substantial area of research. While recent years have seen huge expansion in the size of web-based archives that can be searched efficiently by commercial search engines, the presentation of potentially relevant content is still limited to ranked document lists represented by simple text snippets or image keyframe surrogates. There is expanding interest in techniques to personalise the presentation of content to improve the richness and effectiveness of the user experience. One of the most significant challenges to achieving this is the increasingly multilingual nature of this data, and the need to provide suitably localised responses to users based on this content. The Digital Content Management (DCM) track of the Centre for Next Generation Localisation (CNGL) is seeking to develop technologies to support advanced personalised access and presentation of information by combining elements from the existing research areas of Adaptive Hypermedia and Information Retrieval. The combination of these technologies is intended to produce significant improvements in the way users access information. We review key features of these technologies and introduce early ideas for how these technologies can support localisation and localised content before concluding with some impressions of future directions in DCM

    Multilingual adaptive search for digital libraries

    This paper describes a framework for Adaptive Multilingual Information Retrieval (AMIR) which allows multilingual resource discovery and delivery using on-the-fly machine translation of documents and queries. Result documents are presented to the user in a contextualised manner. Challenges and affordances of both Adaptive and Multilingual IR, with a particular focus on Digital Libraries, are detailed. The framework components are motivated by a series of results from experiments on query logs and documents from The European Library. We conclude that factoring adaptivity and multilinguality aspects into the search process can enhance the user's experience with online Digital Libraries.

    Dataset creation framework for personalized type-based facet ranking tasks evaluation

    Faceted Search Systems (FSS) have gained prominence in many existing vertical search systems. They provide facets to assist users in locating their desired search target quickly. In this paper, we present a framework to generate datasets appropriate for simulation-based evaluation of these systems. We focus on the task of personalized type-based facet ranking. Type-based facets (t-facets) represent the categories of the resources being searched in the FSS. They are usually organized in a large multilevel taxonomy. Personalized t-facet ranking methods aim at identifying and ranking the parts of the taxonomy that reflect query relevance as well as user interests. While evaluation protocols have been developed for facet ranking, the problem of personalising the facet rank based on user profiles has lagged behind due to the lack of appropriate datasets. To fill this gap, this paper introduces a framework to reuse and customise existing real-life data collections. The framework outlines the eligibility criteria and the data structure requirements needed for this task. It also details the process to transform the data into a ground-truth dataset. We apply this framework to two existing data collections in the domain of Point-of-Interest (POI) suggestion. The generated datasets are analysed with respect to taxonomy richness (variety of types) and user profile diversity and length. In order to experiment with the generated datasets, we combine this framework with a widely adopted simulated user-facet interaction model to evaluate a number of existing personalized t-facet ranking baselines.

    A probabilistic approach to personalize type-based facet ranking for POI suggestion

    Faceted Search Systems (FSS) have become one of the main search interfaces used in vertical search systems, offering users meaningful facets to refine their search query and narrow down the results quickly to find the intended search target. This work focuses on the problem of ranking type-based facets. In a structured information space, type-based facets (t-facets) indicate the category to which each object belongs. When they belong to a large multi-level taxonomy, it is desirable to rank them separately before ranking other facet groups. This helps the searcher in filtering the results according to their type first. This also makes it easier to rank the rest of the facets once the type of the intended search target is selected. Existing research employs the same ranking methods for different facet groups. In this research, we propose a two-step approach to personalize t-facet ranking. The first step assigns a relevance score to each individual leaf-node t-facet. The score is generated using probabilistic models and it reflects t-facet relevance to the query and the user profile. In the second step, this score is used to re-order and select the sub-tree to present to the user. We investigate the usefulness of the proposed method for a Point Of Interest (POI) suggestion task. Our evaluation aims at capturing the user effort required to fulfil her search needs by using the ranked facets. The proposed approach achieved better results than other existing personalized baselines.

    Where should I go? A deep learning approach to personalize type-based facet ranking for POI suggestion

    In a faceted search system, type-based facets (t-facets) represent the categories of the resources being searched. Ranking algorithms are needed to select and promote the most relevant t-facets. However, as these are extracted from large multi-level taxonomies, they are impossible to show entirely to the user. Facet ranking is usually employed to filter out irrelevant facets for the users. Existing facet ranking methods neglect both the hierarchical structure of t-facets and the user's historical preferences. This research introduces a personalized t-facet ranking that addresses both issues. During a first step, a Deep Neural Network (DNN) model is trained to assign a relevance score to each t-facet based on three groups of relevance features. The score reflects the t-facet relevance to the user, the input query, and its general importance in the dataset. Subsequently, these scores are aggregated and the t-facets are re-organised into a smaller sub-tree to be presented to the user. Our approach aims at minimizing the effort required by the user to reach their intended search target. This is measured in terms of the number of clicks the user has to perform on the t-facet tree to reach a relevant resource. The approach is applied to a Point-Of-Interest suggestion task. We solve the problem by ranking the categories of the venues as t-facets. The evaluation compares our DNN-based approach with other existing baselines and investigates the individual contribution of each group of features. Our experiment has demonstrated that the proposed personalized deep learning model leads to better t-facet rankings and minimized user effort.

    Personalizing type-based facet ranking using BERT embeddings

    In Faceted Search Systems (FSS), users navigate the information space through facets, which are attributes or meta-data that describe the underlying content of the collection. Type-based facets (aka t-facets) help explore the categories associated with the searched objects in a structured information space. This work investigates how personalizing t-facet ranking can minimize user effort to reach the intended search target. We propose a lightweight personalisation method based on the Vector Space Model (VSM) for ranking the t-facet hierarchy in two steps. The first step scores each individual leaf-node t-facet by computing the similarity between the t-facet BERT embedding and the user profile vector. In this model, the user's profile is expressed in a category space through vectors that capture the users' past preferences. In the second step, this score is used to re-order and select the sub-tree to present to the user. The final ranked tree reflects the t-facet relevance both to the query and the user profile. Through the use of embeddings, the proposed method effectively handles unseen facets without adding extra processing to the FSS. The effectiveness of the proposed approach is measured by the user effort required to retrieve the sought item when using the ranked facets. The approach outperformed existing personalization baselines.
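The two-step ranking described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the similarity is taken to be cosine similarity, the embeddings are plain lists of floats supplied by the caller (no BERT model is loaded), and the abstract does not say how leaf scores are aggregated, so the sketch scores each parent category by its best leaf. The names `score_leaves` and `rank_tree` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score_leaves(leaf_embeddings, user_profile):
    """Step 1: score each leaf t-facet by similarity to the user profile."""
    return {leaf: cosine(vec, user_profile)
            for leaf, vec in leaf_embeddings.items()}


def rank_tree(parent_of, leaf_scores, top_k):
    """Step 2 (assumed aggregation): group leaves under their parent, score a
    parent by its best leaf, and keep the top_k parents with sorted leaves."""
    groups = {}
    for leaf, score in leaf_scores.items():
        groups.setdefault(parent_of[leaf], []).append((score, leaf))
    ranked = []
    for parent, children in groups.items():
        # Highest score first; ties broken alphabetically for determinism.
        children.sort(key=lambda pair: (-pair[0], pair[1]))
        best_score = children[0][0]
        ranked.append((best_score, parent, [leaf for _, leaf in children]))
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return [(parent, leaves) for _, parent, leaves in ranked[:top_k]]
```

Because a new category only needs an embedding to be scored, a sketch like this also shows why an embedding-based approach can rank facets that never appeared in a user's history.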